logic puzzle
Bridging Natural Language and ASP: A Hybrid Approach Using LLMs and AMR Parsing
Connar Hite, Sean Saud, Raef Taha, Nayim Rahman, Tanvir Atahary, Scott Douglass, Tarek Taha
Answer Set Programming (ASP) is a declarative programming paradigm based on logic programming and non-monotonic reasoning. It is a tremendously powerful tool for describing and solving combinatorial problems. Like any other language, ASP requires users to learn its semantics and syntax, yet it is increasingly necessary for people unfamiliar with programming languages to interact with code. This paper proposes a novel method of translating unconstrained English into ASP programs for logic puzzles using an LLM and Abstract Meaning Representation (AMR) graphs. ASP rules, facts, and constraints are all generated to fully represent and solve the desired problem. Example logic puzzles are used to demonstrate the capabilities of the system. While most current methods rely entirely on an LLM, our system restricts the LLM to straightforward tasks: simplifying natural language sentences, identifying keywords, and generating simple facts. AMR graphs are then parsed from the simplified language and used to systematically generate ASP constraints. The system successfully creates an entire ASP program that solves a combinatorial logic problem. This approach is a significant first step toward a lighter-weight, explainable system that converts natural language into programs that solve complex logic problems.
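To make the target representation concrete, here is a minimal sketch of the kind of program such a system would emit for a toy puzzle: facts, a choice rule, and integrity constraints, solved through clingo's Python API. The puzzle, the predicate names, and the use of the `clingo` package are illustrative assumptions, not details taken from the paper.

```python
# A toy ASP encoding of a logic puzzle, solved with clingo (pip install clingo).
import clingo

PROGRAM = """
person(alice; bob; carol).
drink(tea; coffee; milk).

% Each person gets exactly one drink (choice rule).
1 { has(P, D) : drink(D) } 1 :- person(P).

% No two people share a drink (integrity constraint).
:- has(P1, D), has(P2, D), P1 != P2.

% Clues: "Bob drinks tea" (fact), "Alice does not drink coffee" (constraint).
has(bob, tea).
:- has(alice, coffee).
"""

ctl = clingo.Control(["0"])            # "0": enumerate all answer sets
ctl.add("base", [], PROGRAM)
ctl.ground([("base", [])])
ctl.solve(on_model=lambda m: print("Answer set:", m))
```

Running this yields the single answer set in which Bob has tea, Alice has milk, and Carol has coffee; the clues are exactly the part a translation pipeline must derive from English.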
Steven Pinker's new book shows how he's become a contradictory figure
Steven Pinker's new book When Everyone Knows That Everyone Knows makes a compelling case for common knowledge, and it perfectly encapsulates what a contradictory figure he has become. Much of it is a clear, fascinating explanation of a major psychological phenomenon. But then he starts telling you what he thinks about current affairs, arguing, for instance, that "cancel culture" is a form of censorship. Pinker is a psychologist at Harvard University who has written a string of popular science books. Some, like Words and Rules, are rooted in his own research and are a good read.
LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning
Zhen Hao Wong, Jingwen Deng, Runming He, Zirong Chen, Qijie You, Hejun Dong, Hao Liang, Chengyu Shen, Bin Cui, Wentao Zhang
Large language models (LLMs) excel at many supervised tasks but often struggle with structured reasoning in unfamiliar settings. This discrepancy suggests that standard fine-tuning pipelines may instill narrow, domain-specific heuristics rather than fostering general-purpose thinking strategies. In this work, we propose a "play to learn" framework that fine-tunes LLMs through reinforcement learning on a suite of seven custom logic puzzles, each designed to cultivate distinct reasoning skills such as constraint propagation, spatial consistency, and symbolic deduction. Using a reinforcement learning setup with verifiable rewards, models receive binary feedback based on puzzle correctness, encouraging iterative, hypothesis-driven problem solving. We demonstrate that this training approach significantly improves out-of-distribution performance on a range of mathematical benchmarks, especially for mid-difficulty problems that require multi-step reasoning. Analyses across problem categories and difficulty levels reveal that puzzle training promotes transferable reasoning routines, strengthening algebraic manipulation, geometric inference, and combinatorial logic, while offering limited gains on rote or highly specialized tasks. These findings show that reinforcement learning over logic puzzles reshapes the internal reasoning of LLMs, enabling more robust and compositional generalization without relying on task-specific symbolic tools.
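As a concrete illustration of the reward signal described above, here is a minimal, self-contained sketch: the reward is 1.0 exactly when the emitted answer satisfies every puzzle constraint, and 0.0 otherwise. The puzzle encoding below is a hypothetical stand-in, not the paper's actual format.

```python
# Binary "verifiable reward": feedback depends only on puzzle correctness,
# with no learned reward model in the loop.
def verifiable_reward(constraints, answer: dict) -> float:
    """Return 1.0 iff the answer satisfies every puzzle constraint."""
    return 1.0 if all(c(answer) for c in constraints) else 0.0

# Example: a two-variable toy constraint-satisfaction puzzle.
constraints = [
    lambda a: a["x"] != a["y"],        # values must differ
    lambda a: a["x"] + a["y"] == 5,    # values must sum to 5
]
print(verifiable_reward(constraints, {"x": 2, "y": 3}))  # 1.0
print(verifiable_reward(constraints, {"x": 2, "y": 2}))  # 0.0
```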
VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL
Yichen Feng, Zhangchen Xu, Fengqing Jiang, Yuetai Li, Bhaskar Ramasubramanian, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
Vision language models (VLMs) are expected to perform effective multimodal reasoning and make logically coherent decisions, which is critical to tasks such as diagram understanding and spatial problem solving. However, current VLM reasoning lacks large-scale and well-structured training datasets. To bridge this gap, we propose VisualSphinx, a first-of-its-kind large-scale synthetic visual logical reasoning training dataset. To tackle the challenge of synthesizing images with grounded answers, we propose a rule-to-image synthesis pipeline, which extracts and expands puzzle rules from seed questions and generates image-synthesis code for puzzle sample assembly. Experiments demonstrate that VLMs trained using GRPO on VisualSphinx benefit from the logical coherence and readability of our dataset and exhibit improved performance on logical reasoning tasks. The enhanced reasoning capabilities developed on VisualSphinx also transfer to other reasoning tasks such as algebraic, arithmetic, and geometry reasoning.
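The rule-to-image idea can be sketched in a few lines: expand a seed rule into variants, then generate each puzzle together with its grounded answer. All function names here are assumptions, and the "rendering" is a textual shape sequence rather than a real image, purely for illustration.

```python
# Illustrative skeleton of a rule-to-puzzle pipeline with grounded answers.
def expand_rule(seed_rule):
    """Produce perturbed variants of a seed puzzle rule."""
    return [{"shapes": seed_rule["shapes"], "step": k} for k in (1, 2, 3)]

def render(rule, length=4):
    """'Render' the rule as a shape sequence; the answer is the next item."""
    seq = [rule["shapes"][(i * rule["step"]) % len(rule["shapes"])]
           for i in range(length + 1)]
    return seq[:length], seq[length]   # (puzzle panels, grounded answer)

seed = {"shapes": ["circle", "square", "triangle"]}
for variant in expand_rule(seed):
    panels, answer = render(variant)
    print(panels, "->", answer)
```

Because the answer is computed from the same rule that generated the panels, every synthetic sample comes with a verifiable label, which is what makes such data usable for RL.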
Causal language modeling can elicit search and reasoning capabilities on logic puzzles
Causal language modeling using the Transformer architecture has yielded remarkable capabilities in Large Language Models (LLMs) over the last few years. However, the extent to which fundamental search and reasoning capabilities have emerged within LLMs remains a topic of ongoing debate. In this work, we study whether causal language modeling can learn a complex task such as solving Sudoku puzzles. To solve a Sudoku, the model must first search over all empty cells of the puzzle to decide which cell to fill, and then apply an appropriate strategy to fill it. Sometimes, applying a strategy only thins down the possible values for a cell rather than concluding its exact value.
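The "thinning" step the abstract mentions is ordinary constraint propagation; a minimal sketch is below. Candidates are computed per cell, and a cell is only filled when its candidate set narrows to a single value (the "naked single" strategy); repeated passes implement the propagation. The grid encoding (0 for empty) is an assumption for illustration.

```python
# Candidate thinning for a 9x9 Sudoku grid (lists of lists, 0 = empty).
def candidates(grid, r, c):
    """Values not yet ruled out for cell (r, c) by its row, column, and box."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - used

def fill_singletons(grid):
    """One pass of 'naked single': fill cells with exactly one candidate."""
    progress = False
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                cand = candidates(grid, r, c)
                if len(cand) == 1:
                    grid[r][c] = cand.pop()
                    progress = True
    return progress   # loop until False; remaining cells need other strategies
```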
LR$^2$Bench: Evaluating Long-chain Reflective Reasoning Capabilities of Large Language Models via Constraint Satisfaction Problems
Jianghao Chen, Zhenlin Wei, Zhenjiang Ren, Ziyong Li, Jiajun Zhang
Recent progress in o1-like models has significantly enhanced the reasoning abilities of Large Language Models (LLMs), empowering them to tackle increasingly complex tasks through reflection capabilities, such as making assumptions, backtracking, and self-refinement. However, effectively evaluating such reflection capabilities remains challenging due to the lack of appropriate benchmarks. To bridge this gap, we introduce LR$^2$Bench, a novel benchmark designed to evaluate the Long-chain Reflective Reasoning capabilities of LLMs. LR$^2$Bench comprises 850 samples across six Constraint Satisfaction Problems (CSPs) where reflective reasoning is crucial for deriving solutions that meet all given constraints. Each type of task focuses on distinct constraint patterns, such as knowledge-based, logical, and spatial constraints, providing a comprehensive evaluation of diverse problem-solving scenarios. We conduct an extensive evaluation of both conventional models and o1-like models. Our experimental results reveal that even the most advanced reasoning-specific models, such as DeepSeek-R1 and OpenAI o1-preview, struggle with tasks in LR$^2$Bench, achieving average Exact Match scores of only 20.0% and 23.6%, respectively. These findings underscore the significant room for improvement in the reflective reasoning capabilities of current LLMs. The leaderboard of our benchmark is available at https://huggingface.co/spaces/UltraRonin/LR2Bench
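The assume / check / backtrack loop that these CSP tasks are meant to probe can be written down directly; a plain backtracking solver is sketched below. The variable and constraint encoding is illustrative, not the benchmark's format.

```python
# Backtracking CSP search: assume a value, check constraints, backtrack on failure.
def solve(variables, domains, constraints, assignment=None):
    assignment = assignment or {}
    if len(assignment) == len(variables):
        return assignment                      # complete, consistent assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value                # make an assumption
        if all(c(assignment) for c in constraints):
            result = solve(variables, domains, constraints, assignment)
            if result:
                return result
        del assignment[var]                    # backtrack / self-refine
    return None

# Example: two variables that must differ and sum to 4. The guard lets the
# constraint pass while the assignment is still partial.
def ok(a):
    if "x" not in a or "y" not in a:
        return True
    return a["x"] != a["y"] and a["x"] + a["y"] == 4

print(solve(["x", "y"], {"x": [1, 2, 3], "y": [1, 2, 3]}, [ok]))
# -> {'x': 1, 'y': 3}
```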
Beyond Interpolation: Extrapolative Reasoning with Reinforcement Learning and Graph Neural Networks
Niccolò Grillo, Andrea Toccaceli, Joël Mathys, Benjamin Estermann, Stefania Fresca, Roger Wattenhofer
Despite incredible progress, many neural architectures fail to properly generalize beyond their training distribution. As such, learning to reason in a correct and generalizable way is one of the current fundamental challenges in machine learning. In this respect, logic puzzles provide a great testbed, as we can fully understand and control the learning environment. Thus, they allow us to evaluate performance on previously unseen, larger, and more difficult puzzles that follow the same underlying rules. Since traditional approaches often struggle to represent such scalable logical structures, we propose to model these puzzles using a graph-based approach. Then, we investigate the key factors enabling the proposed models to learn generalizable solutions in a reinforcement learning setting. Our study focuses on the impact of the architecture's inductive bias, different reward systems, and the role of recurrent modeling in enabling sequential reasoning. Through extensive experiments, we demonstrate how these elements contribute to successful extrapolation on increasingly complex puzzles. These insights and frameworks offer a systematic way to design learning-based systems capable of generalizable reasoning beyond interpolation.
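A graph encoding of this kind can be made concrete with Sudoku: cells become nodes and mutual-exclusion relations become edges, so the same construction scales to larger puzzle sizes. This is a generic sketch of the idea, not the paper's actual model.

```python
# Constraint graph of an n^2 x n^2 Sudoku: nodes are cells, edges connect
# cells that must hold different values (same row, column, or box).
def sudoku_graph(n=3):
    size = n * n
    def peers(r, c):
        for i in range(size):
            if i != c: yield (r, c), (r, i)               # same row
            if i != r: yield (r, c), (i, c)               # same column
        br, bc = n * (r // n), n * (c // n)
        for i in range(br, br + n):
            for j in range(bc, bc + n):
                if (i, j) != (r, c): yield (r, c), (i, j)  # same box
    return {frozenset(e) for r in range(size) for c in range(size)
            for e in peers(r, c)}

print(len(sudoku_graph(2)))  # 4x4 puzzle: 56 undirected constraint edges
```

Because only `n` changes between puzzle sizes, a GNN trained on small instances can be evaluated on larger ones, which is precisely the extrapolation setting the abstract describes.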
AI tools like ChatGPT and Google's Gemini are 'irrational' and prone to making simple mistakes, study finds
While you might expect AI to be the epitome of cold, logical reasoning, researchers now suggest that these systems might be even more illogical than humans. Researchers from University College London put seven of the top AIs through a series of classic tests of human reasoning. Even the best-performing AIs were found to be irrational and prone to simple mistakes, with most models getting the answer wrong more than half the time. However, the researchers also found that these models weren't irrational in the same way as a human, while some even refused to answer logic questions on 'ethical grounds'. Olivia Macmillan-Scott, a PhD student at UCL and lead author on the paper, says: 'Based on the results of our study and other research on Large Language Models, it's safe to say that these models do not "think" like humans yet.'
Leveraging Large Language Models to Generate Answer Set Programs
Adam Ishay, Zhun Yang, Joohyung Lee
Large language models (LLMs), such as GPT-3 and GPT-4, have demonstrated exceptional performance in various natural language processing tasks and have shown the ability to solve certain reasoning problems. However, their reasoning capabilities are limited and relatively shallow, despite the application of various prompting techniques. In contrast, formal logic is adept at handling complex reasoning, but translating natural language descriptions into formal logic is a challenging task that non-experts struggle with. This paper proposes a neuro-symbolic method that combines the strengths of large language models and answer set programming. Specifically, we employ an LLM to transform natural language descriptions of logic puzzles into answer set programs. We carefully design prompts for an LLM to convert natural language descriptions into answer set programs in a step-by-step manner. Surprisingly, with just a few in-context learning examples, LLMs can generate reasonably complex answer set programs. The majority of errors made are relatively simple and can be easily corrected by humans, thus enabling LLMs to effectively assist in the creation of answer set programs.
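The few-shot, step-by-step prompting pattern described here is easy to sketch; the example below is an illustrative reconstruction, not the authors' actual prompts. Each in-context example pairs a puzzle description with its ASP encoding, broken into the same steps the new puzzle should follow.

```python
# Illustrative few-shot prompt scaffolding for NL -> ASP translation.
FEW_SHOT = """\
Problem: Three friends each own a different pet (cat, dog, fish). Sam does not own the dog.
Step 1 (constants): friend(sam; alex; pat). pet(cat; dog; fish).
Step 2 (choice rule): 1 { owns(F, P) : pet(P) } 1 :- friend(F).
Step 3 (uniqueness): :- owns(F1, P), owns(F2, P), F1 != F2.
Step 4 (clues): :- owns(sam, dog).
"""

def build_prompt(puzzle_text: str) -> str:
    """Append the new puzzle after the worked example(s)."""
    return (FEW_SHOT
            + f"\nProblem: {puzzle_text}\n"
            + "Translate this problem into an answer set program, "
              "following the same steps.")

print(build_prompt("Four students each take a different class..."))
```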
Exploiting Asymmetry in Logic Puzzles: Using ZDDs for Symbolic Model Checking Dynamic Epistemic Logic
Daniel Miedema, Malvin Gattinger
Binary decision diagrams (BDDs) are widely used to mitigate the state-explosion problem in model checking. Zero-suppressed Decision Diagrams (ZDDs) are a variation of BDDs that omit variables that must be false, instead of omitting variables that do not matter. We use ZDDs to symbolically encode Kripke models used in Dynamic Epistemic Logic, a framework to reason about knowledge and information dynamics in multi-agent systems. We compare the memory usage of different ZDD variants for three well-known examples from the literature: the Muddy Children, the Sum and Product puzzle, and the Dining Cryptographers. Our implementation is based on the existing model checker SMCDEL and the CUDD library. Our results show that replacing BDDs with the right variant of ZDDs can significantly reduce memory usage. This suggests that ZDDs are a useful tool for model checking multi-agent systems.
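The contrast between the two reduction rules can be shown in a few lines. In the toy sketch below, nodes are `(var, lo, hi)` triples over the terminals 0 and 1: the BDD rule skips a node whose branches coincide (the variable does not matter), while the ZDD rule skips a node whose hi-branch is the 0-terminal (the variable must be false). This is a simplification for illustration, not the CUDD encoding.

```python
# Toy node constructors contrasting BDD vs. ZDD reduction rules.
def make_bdd(var, lo, hi, table):
    if lo == hi:                  # variable does not matter -> no node needed
        return lo
    return table.setdefault((var, lo, hi), (var, lo, hi))

def make_zdd(var, lo, hi, table):
    if hi == 0:                   # variable must be false -> no node needed
        return lo
    return table.setdefault((var, lo, hi), (var, lo, hi))
```

In sparse Kripke models where most atoms are false at most states, the ZDD rule fires far more often than the BDD rule, which is the asymmetry the title refers to.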